Synopsis: Metadata Tribbles

In this lesson, we will see a case showing how a database design creates more work.

In this lesson, you will see a case showing how a database design creates more work.

Creating the Customers table#

My friend worked for years as a programmer in Oracle PL/SQL and Java. She described a case that showed how a database design that was intended to simplify work instead created more work.

The Sales division at her company maintained a table Customers that kept data such as customers’ contact information, their business type, and how much revenue had been received from each customer:

Creating Customers table

Adding a series of columns for each year#

But the Sales division needed to break down the revenue by year so they could keep track of the currently active customers. They decided to add a series of new columns, each column’s name indicating the year it covered:

Creating columns for each year

Then, they only entered data for those customers who they thought were interesting to track. On most rows, they left null in the revenue columns. The programmers started wondering whether they could store other information in these mostly unused columns.

Each year, they needed to add one more column. A database administrator was responsible for managing Oracle’s tablespaces. So each year, they had a series of meetings, scheduled a data migration to restructure the tablespace, and added the new column. Ultimately they wasted a lot of time and money.

Objective: Support scalability#

Performance degrades for any database query as the volume of data goes up. Even if a query returns results promptly with a few thousand rows, the tables naturally accumulate data to the point where the same query may not have acceptable performance. Using indexes intelligently helps, but the growth of tables inevitably affects the speed of queries against them.

The objective is to structure a database to improve the performance of queries and support tables that grow steadily.

Legitimate uses of the antipattern#

One good use of manually splitting tables is archiving — removing historical data from day-to-day use. The need to run queries against historical data is often greatly reduced after the data is no longer current. If we don’t need to query current and historical data together, it’s appropriate to copy the older data to another location and delete it from the active tables. Archiving keeps the data in a compatible table structure for occasional analysis but allows queries against current data to run with greater performance.

Sharding Databases at WordPress.com

Solution: Create Dependent Table
Antipattern: Clone Tables or Columns
Mark as Completed
Report an Issue